BioData Mining
○ Springer Science and Business Media LLC
Preprints posted in the last 7 days, ranked by how well they match BioData Mining's content profile, based on 15 papers previously published here. The average preprint has a 0.03% match score for this journal, so anything above that is already an above-average fit.
Sullivan, C. R.; Anderson, S.; Caola, L.; Rawstern, T.; Loleng, J.; Roghair, J.; Dastin-Van Rijn, E.; Gustafson, K.; Randolph, A.
We assembled a multimodal clinical dataset describing demographics, placement history, prenatal substance exposure (PSE), birth characteristics, adverse childhood experiences (ACEs), International Classification of Diseases (ICD) diagnoses, and laboratory results for 3,685+ pediatric patients evaluated between 2014 and 2024 at the University of Minnesota's Adoption Medicine Clinic (AMC). Data were curated from electronic medical records (EMRs) through a combined manual and automated extraction protocol using a standardized operating procedure. The resulting dataset integrates structured EMR fields, including neuropsychological, laboratory, and diagnostic information, with manually extracted fields for ACE scores, PSE history, and placement history. We provide an overview of the population represented and describe the dataset's structure, variable definitions, and validation procedures. This resource enables investigations into how early adversity impacts medical and developmental outcomes, and provides one of the largest standardized clinical placement history, PSE, and ACE datasets in an adoption and foster care pediatric population.
Chawla, A.; Carter, S.; Wood, A.; Staffieri, S.; Dodgshun, A.; Eisenstat, D.; Sullivan, M.
Background: Platinum-based chemotherapy is known to cause severe and debilitating hearing loss, but unlike cisplatin, the true incidence of carboplatin-induced hearing loss remains unclear. We evaluated functional hearing outcomes in children receiving carboplatin to determine the incidence and severity of ototoxicity. Procedure: We identified a large cohort of children with cancer treated with carboplatin and graded their audiograms using the SIOP ototoxicity scale. Patients with inadequate audiological follow-up, prior hearing loss, or exposure to cisplatin were excluded. Fisher's exact test, logistic regression, and ROC analyses were performed to investigate associations of demographic, treatment, and exposure-related risk factors with incidence of hearing loss. Results: 200 patients were included, all of whom had been treated with carboplatin. Only nine (4.5%) patients developed clinically significant hearing loss (SIOP grade ≥2). Younger age at first exposure to carboplatin was the only significant predictor of hearing loss (OR = 0.7888, p=0.0241). Age ≤28 months was significantly associated with hearing loss (OR 12.37, p=0.0042). No other risk factors or exposures were statistically significant. Conclusions: Clinically significant carboplatin-associated hearing loss was uncommon (incidence 4.5%). We show that young age is the single most important risk factor for hearing loss; of nine children who developed hearing loss, eight were aged ≤28 months. Children at or below this age have twelve-fold higher odds of developing hearing loss compared to those above this age (OR 12.37). These findings will allow physicians to provide more appropriate counselling to families regarding ototoxic risk and support intensified hearing surveillance in young children.
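As a rough illustration of how an odds ratio and a two-sided Fisher's exact p-value come out of a 2x2 table, here is a minimal Python sketch. The abstract reports only the totals (200 patients, 9 with hearing loss, 8 of them aged ≤28 months); the number of unaffected children in each age group used below is an assumption chosen to be consistent with those totals, not a figure taken from the study. The function names are hypothetical.

```python
from math import comb


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value: sum of the probabilities of all
    tables (with the same margins) that are no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k events in the first row.
        return comb(row1, k) * comb(row2, col1 - k) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(k)
        for k in range(lowest, highest + 1)
        if prob(k) <= observed * (1 + 1e-9)  # small tolerance for float ties
    )


# ASSUMED table (rows: age <=28 months, age >28 months;
# columns: hearing loss, no hearing loss). Only the 8/1 split of the
# nine affected children and the total of 200 come from the abstract.
a, b, c, d = 8, 75, 1, 116

or_estimate = odds_ratio(a, b, c, d)  # about 12.37 for these counts
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"OR = {or_estimate:.2f}, two-sided p = {p_value:.4f}")
```

With a single affected child in the older group, the estimate is sensitive to that one cell, which is why an exact test is a natural choice over a large-sample approximation here.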
Fayette, L.; Brendel, K.; Mentre, F.
Joint modelling of longitudinal data using non-linear mixed effects models and time-to-event outcomes provides a suitable framework to account for informative censoring when estimating biomarker dynamics and quantifying event risk using covariates and longitudinal trajectories. Their usefulness in clinical research depends on data collection design, particularly to precisely estimate the association (link) parameter between longitudinal and survival processes. However, optimal design strategies have so far been addressed separately for longitudinal and survival endpoints and remain unexplored for joint models. We propose two Fisher Information Matrix (FIM) computation methods for joint models, relying on Monte Carlo integration over observations combined with either Markov chain Monte Carlo or Adaptive Gaussian Quadrature to integrate random effects. Their accuracy is assessed against clinical trial simulations in an oncological example based on the HORIZON III study with a tumour-growth-survival model including discrete and continuous covariates. We apply these methods to quantify the impact of follow-up duration, sampling richness, sample size, and covariate distribution on parameter uncertainty and test power. In our example, longitudinal-parameter uncertainty is barely affected by follow-up duration or sampling richness, whereas survival-parameter uncertainty decreases substantially from 1-year to 2-year follow-up. The number of subjects needed (NSN) to achieve <15% uncertainty on the link parameter is comparable for a 2-year rich design and a 3-year sparse design. Optimal covariate distributions are stable across designs and systematically improve test power, outperforming longer and richer but non-optimised designs. These FIM-based methods accurately predict uncertainty and test power, enabling design evaluation and NSN computation for joint-model-based clinical studies.
Houghton, A.; Caola, L.; Dastin-Van Rijn, E.; Anderson, S.; Kummerfeld, E.; Sullivan, C.; Simpson, S.; Kalkar, A.; Banerjee, R.; Fiecas, M.; Randolph, A.
Background: Prenatal substance exposure (PSE) occurs when an individual is exposed to substances in utero. PSEs may have lasting effects on mental health. We tested whether PSEs show threshold, cumulative, or individual substance associations with childhood psychiatric diagnoses. Methods: Clinical variables (demographics, ICD-9/10 diagnoses, PSE history) were extracted from electronic health records from the University of Minnesota Adoption Medicine Clinic. PSEs were identified from caregiver and child-protective-services narratives and/or toxicology (cord tissue/blood, meconium). For each ICD-9/10 diagnostic category, we fit logistic regression models comparing (1) exposure thresholds (0, 1, 2, 3, 4+ exposures), (2) a cumulative exposure count, and (3) individual substances to estimate marginal odds ratios (ORs) with 95% Confidence Intervals (CIs). Results: Psychiatric diagnoses increased with the number of PSEs. Relative to no exposure, odds of an Anxiety Disorder rose from OR 1.47 (95% CI 1.16-1.87) with one exposure to OR 2.03 (1.64-2.52) with >=4 exposures. Higher cumulative exposure scores were associated with Anxiety Disorders (OR 1.28, 1.18-1.38), Behavioral and Emotional Disorders (OR 1.42, 1.31-1.54), Substance Use Disorders (OR 1.52, 1.29-1.79), and Mood Disorders (OR 1.16, 1.04-1.30). Alcohol, tobacco, and marijuana exposures were associated with increased odds of at least one psychiatric diagnosis, and each substance showed at least one significant diagnostic cluster when modeled independently. Conclusion: Increasing numbers of PSEs were associated with higher odds of psychiatric diagnoses, with patterns varying by substance and outcome. These findings motivate research on exposure timing and combinations to support earlier identification and intervention for at-risk children.
McBride, F.; Huang, H.; Kapoor, A. K.; Oermann, E.; Frontera, J. A.; Razavian, N.
Background and Purpose: Prognostication after acute ischemic stroke often relies on limited variables and simple risk scores, despite richer information being available at admission. We developed a multimodal AI model using admission data to predict modified Rankin Scale (mRS) outcomes and compared it to established tools. Methods: In a retrospective study of ischemic stroke/TIA patients, we trained three modality-specific models on admission non-contrast head CT, history and physical notes, and structured clinical variables, and combined them in a weighted-average ensemble. We predicted binary (mRS 0-2 versus 3-6) and ordinal mRS (0-6) outcomes at discharge and 90 days. Performance on an external test cohort was compared with THRIVE and SPAN-100 scores using AUROC, AUPRC, Brier score, mean absolute error (MAE), and quadratic weighted kappa (QWK). Results: A total of 6,915 patients were split into training, validation and testing cohorts in a 3:1:1 ratio. For discharge binary mRS (n=1596), the multimodal ensemble achieved significantly better discrimination (AUROC 0.859, AUPRC 0.858) with 25-61% lower Brier scores than THRIVE or SPAN-100 (all p<0.001). For 90-day binary mRS (n=207), the model also outperformed both THRIVE and SPAN-100 (AUROC 0.838, AUPRC 0.805, with 3-38% lower Brier scores). Ordinal mRS prediction showed similarly strong performance with significantly better QWK at discharge and numerically lower MAE. The multimodal ensemble model reassigned about one-third of patients to different risk categories versus THRIVE and was closer to the true discharge outcome in ~74% of discordant cases. Conclusions: We developed a well-calibrated multimodal AI model for prediction of discharge and 90-day post-stroke functional outcomes using only data present at the time of admission. This model outperforms existing prognostic tools and can support early clinical decision-making.
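To make the weighted-average ensemble and the Brier score concrete, the following Python sketch combines three per-modality probabilities into one prediction and scores it. It is a sketch under stated assumptions, not the authors' implementation: the weights, the probabilities, and the function names are all hypothetical, and the abstract does not say how the ensemble weights were chosen.

```python
from typing import Sequence


def weighted_average(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-modality probabilities into a single ensemble probability."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(p * w for p, w in zip(probs, weights)) / total


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# HYPOTHETICAL probabilities of a poor outcome (mRS 3-6) from the imaging,
# clinical-note, and structured-variable models, with the observed label.
patients = [
    ([0.80, 0.70, 0.90], 1),
    ([0.20, 0.30, 0.10], 0),
    ([0.60, 0.40, 0.50], 1),
]
weights = [0.5, 0.3, 0.2]  # hypothetical modality weights

ensemble = [weighted_average(probs, weights) for probs, _ in patients]
labels = [label for _, label in patients]
score = brier_score(labels, ensemble)
print(ensemble, score)
```

A lower Brier score means the predicted probabilities sit closer to the observed outcomes, so it reflects calibration as well as discrimination.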
Chen, P.-W.; Cielo, C.; Walsh, O.; Mcdonald, M.; Song, P. X.; Goldstein, C.; Moreno, J. P.; Jansen, E.; Mitchell, J. A.
Introduction: Actigraphy sleep-wake classification methods increasingly seek to leverage raw acceleration data and machine-learning-based classification, but performance evaluation in pediatrics is limited. We trained machine-learning models using pediatric data and compared their sleep-wake classification performance with existing algorithms for children. Methods: Sixty-five children (46% female, ages 5.3 to 17.7 years) completed in-lab overnight polysomnography and wore a GENEActiv device on their non-dominant wrist. The acceleration data were converted into 30-second epochs and aligned with physician-scored sleep-wake data from electroencephalography. Seven machine-learning models were trained using leave-one-subject-out cross-validation. Epoch-by-epoch analyses generated performance metrics (e.g., balanced accuracy [BA]) and discrepancy analyses provided overall sleep duration bias estimates. Models were ranked on the combination of highest performance and least bias using Euclidean distance scores, where a lower score is closer to perfect performance and zero bias. For benchmarking, we included GGIR sleep scoring algorithms and an adult-trained random forest classifier. Results: Overall, 560.1 hours of polysomnography and actigraphy data were collected (74.4% of epochs were scored as sleep). The pediatric-trained local-global long short-term memory (LSTM) classifier had the best epoch-by-epoch performance (e.g., BA=0.85, sensitivity=0.88, specificity=0.83, ROC-AUC=0.95, and Cohen's kappa=0.67). These metrics exceeded those of an adult-trained random forest classifier and GGIR-based algorithms. Discrepancy analyses revealed that overall sleep duration was underestimated by an average of 25 minutes using the LSTM classifier with no proportional bias. Conclusion: We trained seven pediatric sleep-wake classifiers that had strong ability to detect sleep and wake, with the LSTM classifier performing best.
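One way to read the Euclidean distance ranking described above is as the distance from an ideal point with perfect balanced accuracy and zero sleep-duration bias. The Python sketch below follows that reading. The abstract does not state how bias was scaled against accuracy, so the `bias_scale` parameter, the model names, and all numbers here are hypothetical.

```python
import math


def distance_to_ideal(
    balanced_accuracy: float, bias_minutes: float, bias_scale: float = 60.0
) -> float:
    """Distance from the ideal point (balanced accuracy 1.0, bias 0 minutes).

    bias_scale is an ASSUMED normaliser that puts minutes of bias on a
    scale comparable to the 0-1 accuracy gap.
    """
    performance_gap = 1.0 - balanced_accuracy
    bias_gap = abs(bias_minutes) / bias_scale
    return math.hypot(performance_gap, bias_gap)


# HYPOTHETICAL classifiers: (balanced accuracy, mean sleep-duration bias in minutes).
models = {
    "model_a": (0.90, -5.0),
    "model_b": (0.80, -10.0),
    "model_c": (0.85, 45.0),
}

# Lower distance means closer to perfect performance with no bias.
ranking = sorted(models, key=lambda name: distance_to_ideal(*models[name]))
print(ranking)
```

Note that the ranking depends on the chosen scale: a model with slightly lower accuracy can outrank a more accurate one if its duration bias is much smaller.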
Randolph, A.; Dastin-Van Rijn, E.; Anderson, S.; Caola, L.; Kummerfeld, E.; Sullivan, C.; Simpson, S.; Kallar, A.; Banerjee, R.; Houghton, A.
Background: Adverse childhood experiences (ACEs) are traumatic or adverse events in early life that can have lasting effects on behavioral, emotional, and psychological functioning. Prior research suggests ACEs relate to later psychiatric outcomes through threshold, cumulative, and individual-specific risk patterns. Few studies, however, have operationalized all three models to test ACE-specific associations with diagnosed psychiatric disorders in individuals who are adopted or with foster care histories. Methods: We conducted a cross-sectional retrospective study using electronic health record data from foster care and adopted patients aged 0-21 years old seen at the University of Minnesota Adoption Medicine Clinic (UMN-AMC) between 2014-2024. Extracted measures included ACE history, demographics, and psychiatric diagnoses. We used latent class analysis and logistic regression to identify clusters of adversity and estimate associations with psychiatric diagnosis domains, adjusting for Sex and Age at Initial Visit. Results: ACEs showed a threshold pattern across psychiatric domains, with higher ACE counts associated with greater odds of psychiatric diagnoses. Individual risk modeling indicated that exposure to abuse or violence was associated with higher odds of psychiatric diagnoses. Across cumulative and individual risk approaches, Anxiety Disorders, Mood Disorders, and Behavioral or Emotional Disorders showed the greatest sensitivity to adversity. Conclusion: Current ACE models may not fully capture neurodevelopmental impacts reflected in diagnosed psychiatric disorders among adolescents, particularly in high-risk groups such as foster and adopted individuals. In a large clinic sample our findings support a nuanced association between ACEs and later psychiatric diagnoses and highlight the need for ACE-focused assessment, prevention, and treatment strategies tailored to foster care and adopted populations.
LAWA GARANDJI, D.; BALDE, A. O.
Background: Self-medication with analgesics and non-steroidal anti-inflammatory drugs (NSAIDs) is common in low- and middle-income countries and may expose users to preventable adverse outcomes. Evidence from Guinea remains scarce. This study aimed to estimate the prevalence of self-medication with analgesics and NSAIDs among pharmacy clients in urban Conakry, identify associated factors, and describe clinical risk situations. Methods: We conducted a pharmacy-based analytical cross-sectional study in 30 private pharmacies across Conakry, Guinea. A total of 1,032 participants seeking analgesics or NSAIDs were enrolled between November 3, 2012, and April 5, 2013. Self-medication was defined as acquisition or use without a valid medical prescription. Factors associated with self-medication were analysed using multivariable logistic regression. Results: Among 1,032 participants, 603 reported self-medication (prevalence 58.4%). Previous unsupervised use was reported by 78.7%. The most frequently used medicines were paracetamol (56.9%, n=587), diclofenac (21.3%, n=220), ibuprofen (17.9%, n=185), and aspirin (3.9%, n=40). Overall, 68.0% (n=702) reported no knowledge of potential adverse effects. Clinical risk situations were frequent: gastrointestinal disorders (41.3%, n=426), hypertension (9.2%, n=95), and pregnancy exposure among reproductive-age women (26.0%). In multivariable analysis, self-medication was independently associated with previous analgesic/NSAID use (aOR = 2.8, 95% CI: 2.1 to 3.6), lack of knowledge of adverse effects (aOR = 1.9, 95% CI: 1.4 to 2.5), informal occupation (aOR = 1.6, 95% CI: 1.2 to 2.2), and age 18 to 59 years (aOR = 1.5, 95% CI: 1.1 to 2.1). Conclusions: In this pharmacy-based study conducted in urban Conakry, self-medication with analgesics and NSAIDs was common and frequently associated with limited awareness of potential adverse effects. These findings support the need for strengthened pharmaceutical regulation, pharmacist-led counselling, health literacy interventions, and improved access to primary care. Keywords: self-medication; analgesics; NSAIDs; paracetamol; diclofenac; ibuprofen; pharmacy; Guinea; Conakry; drug safety; public health.
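The reported percentages can be recomputed from the counts given in the abstract, as in the short Python sketch below. The Wilson 95% interval for the prevalence is an addition for illustration only: the abstract does not report a confidence interval for the prevalence, and the function name is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


n_participants = 1032  # from the abstract
counts = {  # counts as reported in the abstract
    "self-medication": 603,
    "paracetamol": 587,
    "diclofenac": 220,
    "ibuprofen": 185,
    "aspirin": 40,
    "no knowledge of adverse effects": 702,
}

# Recompute each percentage from its count.
percentages = {name: 100.0 * k / n_participants for name, k in counts.items()}

prevalence = counts["self-medication"] / n_participants
low, high = wilson_interval(counts["self-medication"], n_participants)
print(f"prevalence = {prevalence:.1%}, Wilson 95% CI = {low:.1%} to {high:.1%}")
```

The Wilson interval is used here rather than the simpler normal approximation because it behaves well across the whole range of proportions, although with a sample of this size the two would be close.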
Periwal, V.
Background: Conventional psychiatric screening instruments summarize symptoms within individual scales and prioritize cases with high single-instrument additive score severity. This design treats items as independent within instruments and ignores cross-instrument covariance structure, making it insensitive to respondents whose responses are distributed across multiple domains in unusual combinations that remain below threshold on every individual scale. Methods: We analyzed two cohorts spanning older and younger adults. Item prompts from depression, stress, anxiety, and sleep instruments were embedded into a shared semantic space using a pretrained sentence encoder. Principal component analysis of the item-prompt embeddings alone (with no use of respondent data at this stage) was used to construct a low-dimensional subspace retaining 80% of variance in the item embedding matrix. Normalized participant responses were then projected into this subspace, with Jaccard-based stability analysis used as a check on dimensional robustness. Multivariate deviation from the cohort norm was quantified with Mahalanobis distance using Ledoit-Wolf covariance regularization. Candidate outliers were defined by the empirical 95th percentile of the cohort-specific distance distribution. To isolate response configurations not already captured by conventional single-instrument extreme-value logic, we excluded all outlier respondents who had endorsed any individual item at the maximum value of its Likert scale on any instrument. For the remaining outliers, anomalous components were backtracked to their original item loadings for interpretation. Results: In the older-adult Health and Retirement Study (HRS) cohort, principal component analysis of 27 item-prompt embeddings showed that a 10-dimensional subspace provided a stable representation of cross-instrument semantic structure. In the younger-adult Xinxiang cohort the corresponding stable solution was 16-dimensional. 
In each cohort, seven respondents remained as multivariate outliers despite falling below every single-instrument extreme-value threshold. These cases were not characterized by uniformly severe symptom scores but by unusual cross-domain response configurations that became visible only in the shared semantic covariance subspace. The response structure of the retained configurations differed across cohorts: older-adult cases more often involved weak endorsement of mood-labeled items alongside nonzero body- and sleep-related responses, whereas younger-adult cases more often involved incomplete response configurations spanning mood, sleep, stress, and self-harm-related items. Conclusions: A semantically aligned, auditable covariance subspace provides a practical tool for flagging unusual multivariate response configurations that single-instrument additive screening may not flag. The method is interpretable at the level of original item contributions. It should be understood as a hypothesis-generating screen for unusual response configurations requiring further clinical assessment, not as a diagnostic instrument. Outcome validity remains to be established by prospective study.
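The outlier-flagging step (Mahalanobis distance under a regularized covariance, thresholded at the cohort's empirical 95th percentile) can be sketched in Python with NumPy as follows. This is not the authors' pipeline: the data are synthetic, the function names are hypothetical, and a fixed shrinkage intensity `alpha` is swapped in as a simple stand-in for the Ledoit-Wolf estimator, which instead chooses the intensity from the data.

```python
import numpy as np


def shrunk_covariance(X: np.ndarray, alpha: float = 0.1) -> np.ndarray:
    """Shrink the sample covariance toward a scaled identity matrix.

    ASSUMPTION: a fixed alpha replaces the data-driven Ledoit-Wolf intensity.
    """
    sample_cov = np.cov(X, rowvar=False)
    mean_variance = np.trace(sample_cov) / sample_cov.shape[0]
    target = mean_variance * np.eye(sample_cov.shape[0])
    return (1.0 - alpha) * sample_cov + alpha * target


def mahalanobis_distances(X: np.ndarray, cov: np.ndarray) -> np.ndarray:
    """Mahalanobis distance of each row of X from the column means."""
    centered = X - X.mean(axis=0)
    precision = np.linalg.inv(cov)
    squared = np.einsum("ij,jk,ik->i", centered, precision, centered)
    return np.sqrt(squared)


# SYNTHETIC stand-in for respondents projected into a 10-dimensional subspace.
rng = np.random.default_rng(0)
projected = rng.normal(size=(200, 10))

distances = mahalanobis_distances(projected, shrunk_covariance(projected))
threshold = np.percentile(distances, 95)  # empirical 95th percentile
candidate_outliers = np.flatnonzero(distances > threshold)
print(len(candidate_outliers), "candidate outliers flagged")
```

In the study's design, candidates flagged this way are then filtered further by removing anyone who endorsed an item at its maximum Likert value, which this sketch does not implement.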
Xiao, J.; Zhao, Z.; King, Z. D.; Khalid, M.; Davies, S.; Zanna, K.; Argueta, D. L.; Brice, K. N.; Wu-Chung, E. L.; Lai, V. D.; Paoletti-Hatcher, J.; Denny, B. T.; Henry, S.; Schulz, P. E.; Fagundes, C. P.; Sano, A.
Spousal caregivers of individuals with Alzheimer's disease and related dementias frequently experience elevated perceived stress, caregiver burden, and loneliness, which are associated with adverse health outcomes. Early identification is therefore critical for timely intervention. Existing approaches commonly rely on wearable sensor data and standardized psychological questionnaires, while recent multimodal methods aim to improve prediction by integrating behavioral and linguistic information. In this study, we explored three modality configurations (wearable-derived features, interview-based text, and their combination) to classify caregiver psychological risk using the Perceived Stress Scale (PSS), Zarit Burden Interview, and UCLA Loneliness Scale. We compared traditional machine learning models and large language models (LLMs) (Gemini 2.0, Llama 4, and GPT-4o) under psychometrician-centered and caregiver-centered prompting strategies. Traditional machine learning models performed better under multimodal settings, while LLMs achieved stronger performance with Interview-Only input. We further demonstrate that PSS was the most predictable construct and that prompting strategies substantially influenced LLM performance.
Plasek, J. M.; Li, Y.; Amato, M. G.; Foer, D.; Seger, D. L.; Alzaidi, S.; Zhou, H.; Jackson, G. P.; Bates, D. W.; Zhou, L.
Background: Adverse drug events (ADEs) are a critical indicator of patient safety but are often documented only in free-text clinical notes. The potential of recent advances in natural language processing (NLP), particularly generative large language models (LLMs), to identify ADEs remains understudied. This study aimed to compare the performance of multiple LLMs in identifying ADE-Drug relationships in inpatient and ambulatory clinical notes. Methods: We used clinical notes from the 2018 National NLP Clinical Challenge (n2c2) ADE dataset (inpatient; n=505) and from outpatient encounters (n=2,555) between October 1, 2018, and December 31, 2019, at a large academic medical center based in New England. Notes were pre-processed into snippets for model input. The evaluated models were GPT-4o, GPT-4o-mini, and LLAMA 3.3-70B, along with their instruction fine-tuned variants (including low-rank adapters for LLAMA). Performance was assessed using both strict and relaxed evaluations (precision, recall, and F1) for all models, followed by manual evaluation (exact semantic match, partial match, missing ADE, drug mention only, not a drug, or wrong) of the two best-performing models. Results: GPT-4o and GPT-4o-mini were the top-performing models among those evaluated. GPT-4o consistently outperformed GPT-4o-mini in ADE extraction across both datasets, with higher F1-scores (0.524 vs. 0.381) and a more balanced precision-recall profile. Both models captured ADEs effectively in explicit and complex clinical contexts, although limitations included misclassification of pre-existing allergies and occasional conflation of therapeutic indications with adverse effects. GPT-4o achieved higher exact match coverage and fewer errors across clinical notes, indicating more reliable performance in both inpatient and ambulatory settings. 
Conclusion: This work establishes a foundation for integrating LLM methods into real-world drug safety surveillance, with direct implications for improving patient safety.
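The difference between strict and relaxed evaluation can be illustrated with a small Python sketch over character spans. It assumes that "strict" means an exact span match and "relaxed" means any overlap, which is a common convention but is not spelled out in the abstract; the spans and function names are hypothetical.

```python
Span = tuple[int, int]  # (start, end) character offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True when two half-open spans share at least one character."""
    return a[0] < b[1] and b[0] < a[1]


def precision_recall_f1(
    gold: list[Span], pred: list[Span], relaxed: bool = False
) -> tuple[float, float, float]:
    """Precision, recall, and F1 under exact (strict) or overlap (relaxed) matching."""

    def match(x: Span, y: Span) -> bool:
        return overlaps(x, y) if relaxed else x == y

    matched_pred = sum(any(match(p, g) for g in gold) for p in pred)
    matched_gold = sum(any(match(g, p) for p in pred) for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# HYPOTHETICAL annotated and predicted ADE mention spans in one note.
gold_spans = [(10, 14), (30, 42), (50, 58)]
pred_spans = [(10, 14), (28, 40), (70, 75)]

strict = precision_recall_f1(gold_spans, pred_spans)
relaxed = precision_recall_f1(gold_spans, pred_spans, relaxed=True)
print("strict:", strict, "relaxed:", relaxed)
```

In this toy case the second prediction overlaps a gold span without matching its boundaries, so it counts only under relaxed matching, which is why relaxed scores are never lower than strict ones.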
Ahmed, Z.; Govindareddy, P.; DeGroat, W.; Narayanan, R.; Peker, E.; Zeeshan, S.
Precision medicine aims to move healthcare from a "one-size-fits-all" approach to personalized and predictive care across diverse populations. It promotes integration of multi-omics and phenotypic data to understand disease mechanisms and discover novel biomarkers and risk factors, which could be used to predict and prevent critical diseases in individual patients across diverse populations. A precision medicine approach can accelerate our ability to classify patients at higher risk of developing critical diseases, improve diagnostic capabilities, develop deeper understanding of individual risk, investigate racial differences and demographic characteristics, and find relationships between genetic variants, expressions, and diseases. This study focuses on implementing an innovative, data-driven framework of translational bioinformatics and Machine Learning (ML) techniques to analyze multi-omics data, including RNA-seq and Whole-Genome Sequencing (WGS) data, generated using blood samples of randomly consented patients. First, we utilized bioinformatics pipelines to identify differentially expressed genes and their pathogenic and likely pathogenic variants for the downstream data analysis, annotation, and visualization. We then applied a nexus of ML models for multi-omics biomarker discovery, disease prediction, density-based clustering, single-patient profiling, and pathogenicity classification. WGS data analysis supported the exploration of genetic variation and diversity among patients to identify known and novel biomarkers, whereas RNA-seq data analysis improved our understanding of the functional and biological pathways underlying disease states. We classified and clustered pathogenic variants and expressions across various genes and discovered leading risk factors for numerous diseases. 
Our results include gene-disease associations and capture common pathways across the broader population, demonstrating a level of sensitivity and accuracy that has broad clinical implications. We validated our results against clinical records and state-of-the-science literature. This study examines the strengths of multi-omics data integration and the capabilities of ML application in genetically diverse and complex patient cohorts. Our approach has the potential to elucidate complex gene-disease interactions for genetically diverse populations, which can support earlier diagnoses for patients in many disease realms.
Reteig, L. C.; Woloshin, S.; Maglione, P. J.; Farmer, J. R.; Ong, M.-S.
Patients with primary immunodeficiency (PID) often face prolonged diagnostic delays and may increasingly turn to large language models (LLMs) to interpret their symptoms during this period. We evaluated whether an LLM could recognize PID from symptom descriptions derived from interviews with 21 PID patients. In a prior study, we showed that GPT-4o identified PID in 96% of cases when prompted with physician-written patient histories (Rider et al., JACI, 2024). Here, when prompted with symptom descriptions in patients' own words, GPT-5 identified PID in only 7 cases (33%), although it more broadly suggested immune system issues in 18 cases (81%). The gap between these findings indicates that LLMs are sensitive to the language and framing of symptom descriptions, performing substantially worse when patients describe their own symptoms in everyday language than when clinicians summarize patient histories in structured medical terms. This study underscores the need to carefully evaluate how LLMs are used in patient-facing applications.
Syvalahti, T.; Tokariev, M.; Nevalainen, P.; Tuiskula, A.; Metsaranta, M.; Haataja, L.; Vanhatalo, S.; Tokariev, A.
Background: Prediction of long-term neurodevelopmental outcomes remains challenging after perinatal asphyxia. Here, we studied whether computational metrics of brain function derived from neonatal EEG are associated with long-term neurodevelopment in infants with perinatal asphyxia. Methods: A total of 36 term-born infants with perinatal asphyxia with or without hypoxic-ischemic encephalopathy were studied with neonatal multichannel electroencephalography (EEG). We computed local EEG amplitudes and phase-amplitude coupling (PAC), as well as large-scale functional cortical networks estimated using amplitude-amplitude correlations (AAC) and phase-phase correlations (PPC). These EEG-derived markers were tested for associations with neurodevelopmental outcomes at two years, assessed using the Griffiths Scales of Child Development, 3rd edition (GMDS-III). Results: EEG amplitudes showed positive associations with GMDS-III Foundations of Learning and General Development scores across most electrodes during quiet sleep, with the strongest effects observed at frontal and central regions (r = 0.44 to 0.66). PAC showed negative associations with the same scores mainly over parietal and temporal regions (r = -0.45 to -0.55). Cortical AAC networks demonstrated the most robust and widespread negative associations in all frequency bands during quiet sleep (r = -0.47 to -0.54), with 70-72% of connections significant in the high delta frequency band. In turn, PPC networks showed frequency-selective and more spatially constrained negative associations during quiet sleep (r = -0.48 to -0.53), involving 5-12% of the network. Conclusions: Both local and network-based metrics in the newborn brain show significant associations with neurodevelopmental outcome at 2 years after perinatal asphyxia.
Kosola, S.; Moro, S.; Holopainen, E.
Objective: Cross-sectional studies indicate associations between self-reported social media use and adolescent wellbeing outcomes. We aimed to evaluate longitudinal associations of objectively measured smartphone and social media use with psychosocial wellbeing. Design: Observational study with one year of follow-up. Setting: High schools in Finland from 2022 to 2023. Population: 259 adolescent girls (mean age 16.3 years at baseline). Main outcome measures: Screenshots depicting smartphone and social media use, Bergen Social Media Addiction Scale (BSMAS), Generalized Anxiety Disorder-7 questionnaire, Body Appreciation Scale 2 (BAS-2), and visual analogue scales (VAS) of mood, tiredness, and loneliness. Results: Across one year of follow-up, anxiety, body appreciation, and mood improved, but possible social media addiction increased from 15% to 17%. Social media addiction at baseline was associated with increased anxiety (r=0.29, p<0.001), lower body appreciation (r=-0.15, p=0.022), and more loneliness (r=0.20, p=0.001) at follow-up. Anxiety at baseline was associated with social media addiction at follow-up (r=0.19, p=0.005). The highest quartile of TikTok users reported more social media addiction (BSMAS 19 [IQR 16-21] vs. 17 [IQR 14-20]; p=0.009) and lower body appreciation (BAS-2 32 [IQR 28-38] vs. 35 [IQR 29-40]; p=0.003) than did others. The highest quartile of Snapchat users reported more social media addiction (BSMAS 19 [IQR 15-21] vs. 17 [IQR 14-20]; p=0.007) and tiredness (VAS 21 [IQR 13-32] vs. 26 [IQR 15-35]; p=0.049) than did others. Conclusions: Consistent with cross-sectional studies, social media addiction was associated with poorer psychosocial outcomes across follow-up. Policies to protect adolescents from social media addiction are urgently needed.
Bonilla, K.; Sherman, V. M.; Arbaiza, A. S.; Dougherty, M.; Olson, L. E.
In some countries, melatonin is sold without a physician prescription and dosage is unregulated. Transdermal products have become popular, including those marketed for children. We measured consumer assumptions about these products among adult residents of the United States, analyzed lot-to-lot variability, and compared the pharmacokinetics of melatonin administered in oral, lotion, and bath product forms. Survey respondents (n=199) believed oral melatonin was more effective than transdermal products and that all melatonin products were relatively safe. Melatonin lotion products analyzed by HPLC displayed lot-to-lot variability as well as changes in formulation and product claims. To determine pharmacokinetics, three different treatments (oral tablets, lotion, and bath immersion) were administered to twelve undergraduate participants in a randomized, crossover design. Five additional participants completed bath product treatment only. Participants collected saliva samples up to 48 hours after administration, which were analyzed for melatonin by enzyme-linked immunosorbent assay. Oral (n=11) and lotion (n=12) formulations caused maximum salivary melatonin levels within 30 minutes after administration, but bath immersion did not cause increases in salivary melatonin (n=17). The half-life of oral melatonin was 1.17 [0.69-1.65] hours versus 5.72 [3.75-7.68] hours for lotion treatment (p = 0.011, effect size r = 0.770). Melatonin lotion may pose a risk to consumers who assume it is safe and less effective than oral tablets, when in fact it may be very potent and remain at high physiological levels into the following day. This study is registered on clinicaltrials.gov (NCT06382610) and was funded by the Sleep Research Society.
Marshall, A. T.; Kan, E.; Adise, S.; König, M.; McConnell, R.; Martinez, M.; Midya, V.; Arora, M.; Sowell, E. R.
Lead is a toxic metal ubiquitous in our environment. While dramatic reductions in lead sources have paralleled equivalent decreases in lead-poisoning rates, chronic lead exposure remains a critical public health concern. Childhood lead exposure (even at its lowest levels) is linked to changes in cognitive development, but less is known about lead's effects on children's brain structure, especially as a result of in utero exposure. We measured prenatal and early-postnatal lead exposure in shed deciduous teeth of 448 9- and 10-year-old children (from 20 United States cities) and linked those lead levels to childhood brain structure, cognition/behavior, and neighborhood- and family-level socioeconomic characteristics. Here we show negative associations between tooth-lead levels and the thickness of the brain's cortex, particularly in regions linked to language processing. With increasing tooth-lead levels, children of lower-income (versus higher-income) families showed steeper declines in receptive vocabulary. Caregiver-reported behavioral problems exhibited similar associations. With in utero exposure linked to adverse neurodevelopmental outcomes (well before lead exposure and its risks are evaluated by healthcare professionals), prenatal screening of maternal lead levels/exposure, coupled with recommended strategies to reduce its placental transmission, may help reduce lead's effects on future generations.
Bressman, E.; Auerbach, A.; Keniston, A.; Jens, C.; Ranji, S.
Introduction: The use of artificial intelligence (AI) by clinicians has increased rapidly in recent years, with large language models (LLMs) emerging as tools that can equal clinician diagnostic performance in simulated settings. However, limited data exist regarding physicians' use of LLMs in real-world clinical practice. This study aimed to evaluate the frequency of LLM use among practicing hospitalists, identify which LLMs are most commonly utilized, and assess hospitalists' perceptions of the benefits and limitations of LLM use in clinical care. Methods: We conducted a cross-sectional survey study of academic hospital medicine faculty across 8 institutions within the Hospital Medicine Reengineering Network (HOMERuN), a collaborative research consortium. Eligible participants included hospitalists practicing within participating HOMERuN sites during the study period. The survey assessed the frequency of LLM use, types of LLMs used, clinical applications, and physician perceptions regarding usefulness, efficiency, and concerns associated with LLM adoption. Results: 170 respondents (67.1%) reported ever using an LLM in clinical practice. Among LLM users, OpenEvidence was the most used tool (88.9%), followed by ChatGPT (58.5%), Google Gemini (26.9%), and Microsoft Copilot (20.5%). Only a minority of hospitalists reported using LLMs daily while seeing patients. The most common use cases of LLMs were answering diagnostic (77.1%) and management (77.6%) questions. A majority also reported using LLMs to identify or summarize primary literature (60.0%). Lack of trust in outputs (49.8%), uncertainty around institutional policies (48.6%), and lack of access to secure applications (43.1%) were cited as the most frequent barriers to using LLMs in practice. Discussion: The use of LLMs in clinical practice is already widespread, though regular or daily use is not yet typical. Concerns regarding reliability, patient privacy, and safe integration into clinical workflows remain significant barriers to broader adoption. The responsible implementation of LLMs in hospital medicine will require addressing these barriers.
Tuttle, M.; Maas, C. C. H. M.; An, J.; Wessler, B. S.; Harvey, W. F.; Selker, H. P.; van Klaveren, D.; Kent, D. M.
The Epic Sepsis Model version 2 (ESMv2) is a prediction model embedded in the electronic medical record that warns clinicians which hospitalized patients are at risk for sepsis. We conducted a retrospective cohort study of 31,951 hospitalizations of 25,760 patients to compare analyses conducted at the commonly used patient level (where the maximum prediction prior to the onset of sepsis is used to measure performance) with analyses at a novel prediction level (where each prediction is used to measure performance). Sepsis, defined by the Sepsis 3 criteria, occurred during 1,049 hospitalizations (3.3%). Patient-level analyses suggested excellent discrimination (AUC 0.86; IQR 0.85, 0.87), whereas prediction-level analyses demonstrated lower performance (AUC 0.62; IQR 0.57, 0.65). Low estimates of the positive predictive value (14.5% at the patient level vs 4% at the prediction level) imply a high number of false alerts. Common evaluation approaches may overstate the performance of dynamic prediction models and mislead clinical decision-making.
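The contrast between patient-level and prediction-level evaluation in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' analysis. The toy `stays` data, the function names, and the rule that every prediction from a sepsis hospitalization counts as a positive are all assumptions made for illustration, and the study may label individual predictions differently. The only figures taken from the abstract are the two positive predictive values, which are converted to false alerts per true alert using (1 - PPV) / PPV.

```python
from itertools import product


def auc(scores, labels):
    """AUC as the chance that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def false_alerts_per_true_alert(ppv):
    """Number of false alerts expected for each true alert at a given PPV."""
    return (1.0 - ppv) / ppv


# Hypothetical toy data (not from the study):
# hospitalization id -> (risk scores over time, 1 if sepsis occurred else 0)
stays = {
    "A": ([0.10, 0.20, 0.90], 1),
    "B": ([0.15, 0.80], 1),
    "C": ([0.10, 0.30, 0.20], 0),
    "D": ([0.05, 0.25], 0),
}

# Patient level: one score per hospitalization, the maximum prediction.
patient_scores = [max(scores) for scores, _ in stays.values()]
patient_labels = [label for _, label in stays.values()]

# Prediction level: every prediction is scored on its own.
# Assumption: each prediction inherits the label of its hospitalization.
prediction_scores = [s for scores, _ in stays.values() for s in scores]
prediction_labels = [label for scores, label in stays.values() for _ in scores]

patient_auc = auc(patient_scores, patient_labels)
prediction_auc = auc(prediction_scores, prediction_labels)

print(round(patient_auc, 2))     # 1.0 on this toy data
print(round(prediction_auc, 2))  # 0.64 on this toy data

# PPVs reported in the abstract: 14.5% (patient level) and 4% (prediction level).
print(round(false_alerts_per_true_alert(0.145), 1))  # about 5.9 false alerts per true alert
print(round(false_alerts_per_true_alert(0.04), 1))   # about 24 false alerts per true alert
```

In this toy example, taking the maximum score per hospitalization hides the many low scores that sepsis patients also receive, so the patient-level AUC looks better than the prediction-level AUC. This is the same direction of difference the abstract reports.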
Powell, B. C.; Amendola, L. M.; Bonini, K. E.; Crosslin, D.; Desrosiers-Battu, L.; Hiatt, S. M.; Hindorff, L.; Kenny, E. E.; Mavura, Y.; Muenzen Ferar, K. D.; Risch, N.; Roman, T.; Slavotinek, A.; Van Ziffle, J.; Bowling, K. M.
Yield of reported results from genetic testing provides a proximal measure of clinical usefulness. While ACMG/AMP guidelines provide representations of uncertainty for individual genetic variant classification, additional factors are considered when determining whether results explain a patient's presentation. To standardize cross-consortium analysis, a working group of the Clinical Sequencing Evidence-Generating Research (CSER2) consortium iteratively identified factors used when contextualizing variant-level results to case-level interpretation (i.e., interpretation of an individual's genetic data with respect to the indication for testing). Sites independently categorized results; complex cases were discussed collaboratively, leading to revision of classification categories. Our metric incorporates factors beyond classification of reported variants. Analogous to variant-level results, "Definitive Positive" and "Probable Positive" represent certainty that results may be clinically explanatory. The category "Inconclusive" applies when results may or may not fully explain the patient presentation, with subdivision into multiple (non-exclusive) subcategories. Cases falling outside all of the other categories are considered "Negative". The overall diagnostic yield by this metric and use of categories for inconclusive results varied by CSER project, in part paralleling study design differences. This case-level categorization provides a meaningful assessment of diagnostic yield, and for inconclusive cases identifies potentially resolvable factors for case resolution.